Text Classification for Marathi Documents using Supervised Learning Methods
Similar resources
Text Passage Classification Using Supervised Learning
In this paper, we describe a method for text passage classification or extraction by means of supervised machine learning and analytically identifying passages. The underlying characteristic of the method lies in the utilization of the resulting classification, which leads to the classification of the portion of a document in a high dimensional feature space into a low dimensional space which i...
Partially Supervised Classification of Text Documents
We investigate the following problem: Given a set of documents of a particular topic or class P, and a large set M of mixed documents that contains documents from class P and other types of documents, identify the documents from class P in M. The key feature of this problem is that there is no labeled non-P document, which makes traditional machine learning techniques inapplicable, as they all...
Semi-supervised learning for text classification using feature affinity regularization
Most conventional semi-supervised learning methods attempt to directly include unlabeled data into training objectives. This paper presents an alternative approach that learns feature affinity information from unlabeled data, which is incorporated into the training objective as regularization of a maximum entropy model. The regularization favors models for which correlated features have similar...
Soft-Supervised Learning for Text Classification
We propose a new graph-based semisupervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a weighted undirected graph and our proposed framework minimizes the weighted Kullback-Leibler divergence between distributions that encode the class membership probabilities of each vertex. The proposed objective is con...
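The quantity this abstract mentions, a weighted Kullback-Leibler divergence between class-membership distributions on neighboring vertices, can be illustrated with a minimal sketch. The abstract is truncated, so this is not the paper's full objective: it shows only a pairwise smoothness term, the symmetric treatment of each undirected edge is an assumption, and the function names, document ids, and toy data are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes.

    eps is a small smoothing constant (an assumption of this sketch)
    that avoids division by zero when q has a zero entry.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def weighted_graph_kl(edges, dists):
    """Sum w_ij * (KL(p_i || p_j) + KL(p_j || p_i)) over undirected edges.

    edges: iterable of (i, j, w_ij) tuples.
    dists: mapping from vertex id to its class-membership distribution.
    """
    total = 0.0
    for i, j, w in edges:
        # Undirected edge: count the divergence in both directions.
        total += w * (
            kl_divergence(dists[i], dists[j])
            + kl_divergence(dists[j], dists[i])
        )
    return total


# Hypothetical toy graph: three documents, two classes.
dists = {
    "doc_a": [0.9, 0.1],
    "doc_b": [0.8, 0.2],
    "doc_c": [0.1, 0.9],
}
edges = [("doc_a", "doc_b", 1.0), ("doc_b", "doc_c", 0.2)]
smoothness = weighted_graph_kl(edges, dists)
```

A small value of `smoothness` means that strongly connected documents have similar class distributions, which is the intuition behind minimizing a term of this kind.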
Supervised Methods for Domain Classification of Tamil Documents
The era of digitization creates a need for domain classification in both online and offline applications. Automatic text classification is needed in diverse fields, and various methodologies, such as machine learning algorithms, have been proposed for it. Here automatic document classification of Tamil documents has been proposed by considering the exponenti...
Journal
Journal title: International Journal of Computer Applications
Year: 2016
ISSN: 0975-8887
DOI: 10.5120/ijca2016912374